Loop Pipelining in Hardware-Software Partitioning
Abstract
This paper presents a hardware-software partitioning algorithm that exploits a loop pipelining technique. The partitioning algorithm is based on iterative improvement and tries to minimize hardware cost through hardware sharing and hardware implementation selection without violating a given performance constraint. The proposed loop pipelining technique, an adaptation of a compiler optimization for instruction-level parallelism, increases parallelism within a loop by transforming the structure of the input system description. Combining this technique with the partitioning algorithm further reduces the hardware cost and/or improves the performance of the partitioned system. Experiments on a JPEG encoder design show about a 19% performance improvement and a 44% reduction in hardware, compared to the results without loop pipelining.
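The general idea behind loop pipelining can be illustrated with a minimal sketch. This is not the paper's algorithm: the two stages `stage_a` and `stage_b`, and the functions `sequential` and `pipelined`, are hypothetical names chosen for illustration. The sketch restructures a loop into a prologue, a kernel, and an epilogue so that, inside the kernel, the second stage of one iteration no longer depends on the first stage of the next. A partitioner could then, in principle, map the two stages to different resources and overlap them.

```python
def stage_a(x):
    # Hypothetical first stage of the loop body (e.g. a transform step).
    return x * 2


def stage_b(y):
    # Hypothetical second stage; it consumes the result of stage_a.
    return y + 1


def sequential(data):
    # Original loop: stage_b of iteration i must wait for stage_a of iteration i.
    out = []
    for x in data:
        y = stage_a(x)
        out.append(stage_b(y))
    return out


def pipelined(data):
    # Restructured loop with the same results as sequential().
    out = []
    if not data:
        return out
    # Prologue: run stage_a for the first iteration only.
    y = stage_a(data[0])
    # Kernel: stage_a of iteration i and stage_b of iteration i-1 are
    # independent of each other here, so they could execute concurrently
    # (this sketch still runs them one after the other).
    for i in range(1, len(data)):
        y_next = stage_a(data[i])
        out.append(stage_b(y))
        y = y_next
    # Epilogue: finish stage_b for the last iteration.
    out.append(stage_b(y))
    return out


print(pipelined([1, 2, 3]))  # [3, 5, 7]
```

Both functions compute the same values; only the dependence structure inside the loop body changes, which is what exposes the extra parallelism.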
Similar resources
Hardware/software co-design for DSP applications via the HMS framework
The design of computer systems that incorporate both standardized off-the-shelf processors, or software, as well as specialized hardware is referred to as hardware/software (hw/sw) co-design. This paper studies the problem and presents a system, the Hardware/Multi-Software Co-design (HMS) system, of obtaining the best hw/sw configuration for DSP applications. New algorithms for performing partit...
Partitioning and pipelining for performance-constrained hardware/software systems
In order to satisfy cost and performance requirements, digital signal processing and telecommunication systems are generally implemented with a combination of different components, from custom-designed chips to off-the-shelf processors. These components vary in their area, performance, programmability and so on, and the system functionality is partitioned amongst the components to best utilize ...
A Software Pipelining Framework for Simple Processor Cores
Current trends in many-core architectures show a switch from a small number of architecturally sophisticated cores (e.g. Intel Core2, IBM PowerPC) to many simple cores (e.g. SiCortex and Tilera multiprocessors). These simple cores lack many of the advanced features of the complex cores (e.g. out-of-order execution, rotating register files, predication, speculation, etc.), which puts extra burden ...
Datapath and memory co-optimization for FPGA-based computation
With the large resource densities available on modern FPGAs, it is often the available memory bandwidth that limits the parallelism (and therefore performance) that can be achieved. For this reason, the focus of this thesis is the development of an integrated scheduling and memory optimisation methodology to allow high levels of parallelism to be exploited in FPGA-based designs. A manual translat...
Advances in Parallel-Stage Decoupled Software Pipelining Leveraging Loop Distribution, Stream-Computing and the SSA Form
Decoupled Software Pipelining (DSWP) is a program partitioning method enabling compilers to extract pipeline parallelism from sequential programs. Parallel Stage DSWP (PS-DSWP) is an extension that also exploits the data parallelism within pipeline filters. This paper presents the preliminary design of a new PS-DSWP method capable of handling arbitrary structured control flow, a slightly better...